data aggregation


Q-Learning-Based Time-Critical Data Aggregation Scheduling in IoT

Vo, Van-Vi, Nguyen, Tien-Dung, Le, Duc-Tai, Choo, Hyunseung

arXiv.org Artificial Intelligence

Time-critical data aggregation in Internet of Things (IoT) networks demands efficient, collision-free scheduling to minimize latency for applications like smart cities and industrial automation. Traditional heuristic methods, with two-phase tree construction and scheduling, often suffer from high computational overhead and suboptimal delays due to their static nature. To address this, we propose a novel Q-learning framework that unifies aggregation tree construction and scheduling, modeling the process as a Markov Decision Process (MDP) with hashed states for scalability. By leveraging a reward function that promotes large, interference-free batch transmissions, our approach dynamically learns optimal scheduling policies. Simulations on static networks with up to 300 nodes demonstrate up to 10.87% lower latency compared to a state-of-the-art heuristic algorithm, highlighting its robustness for delay-sensitive IoT applications. This framework enables timely insights in IoT environments, paving the way for scalable, low-latency data aggregation.
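The unified formulation above can be illustrated with a minimal tabular Q-learning sketch. The state hashing, the `QAgent` class, and the batch-valued actions are simplified assumptions for illustration, not the authors' implementation; the reward promoting large, interference-free batches is assumed to come from the scheduling environment.

```python
import hashlib
import random
from collections import defaultdict

def hash_state(scheduled, tree_edges):
    """Compress a (scheduled-node-set, aggregation-tree) pair into a short
    key, keeping the Q-table compact even for networks of hundreds of nodes."""
    raw = repr((sorted(scheduled), sorted(tree_edges))).encode()
    return hashlib.md5(raw).hexdigest()[:16]

class QAgent:
    def __init__(self, alpha=0.1, gamma=0.9, eps=0.2):
        self.q = defaultdict(float)  # (state_key, action) -> value
        self.alpha, self.gamma, self.eps = alpha, gamma, eps

    def choose(self, state_key, actions):
        # epsilon-greedy over candidate interference-free transmission batches
        if random.random() < self.eps:
            return random.choice(actions)
        return max(actions, key=lambda a: self.q[(state_key, a)])

    def update(self, s, a, reward, s_next, next_actions):
        # standard Q-learning backup: Q(s,a) += alpha * (r + gamma*max Q' - Q)
        best_next = max((self.q[(s_next, a2)] for a2 in next_actions),
                        default=0.0)
        td_target = reward + self.gamma * best_next
        self.q[(s, a)] += self.alpha * (td_target - self.q[(s, a)])
```

In the paper's setting an action would select which nodes transmit in the next slot, with larger collision-free batches earning larger rewards.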


Leveraging Cloud-Fog Automation for Autonomous Collision Detection and Classification in Intelligent Unmanned Surface Vehicles

Tran, Thien, Nguyen, Quang, Kua, Jonathan, Tran, Minh, Luu, Toan, Hoang, Thuong, Jin, Jiong

arXiv.org Artificial Intelligence

Industrial Cyber-Physical Systems (ICPS) technologies are foundational in driving maritime autonomy, particularly for Unmanned Surface Vehicles (USVs). However, onboard computational constraints and communication latency significantly restrict real-time data processing, analysis, and predictive modeling, limiting the scalability and responsiveness of maritime ICPS. To overcome these challenges, we propose a distributed Cloud-Edge-IoT architecture tailored for maritime ICPS, leveraging design principles from the recently proposed Cloud-Fog Automation paradigm. The proposed architecture comprises three hierarchical layers: a Cloud Layer for centralized and decentralized data aggregation, advanced analytics, and future model refinement; an Edge Layer that executes localized AI-driven processing and decision-making; and an IoT Layer responsible for low-latency sensor data acquisition. Our experimental results demonstrate improvements in computational efficiency, responsiveness, and scalability. Compared with conventional approaches, we achieved a classification accuracy of 86% with improved latency performance. By adopting Cloud-Fog Automation, we address the low-latency processing constraints and scalability challenges of maritime ICPS applications. Our work offers a practical, modular, and scalable framework to advance robust, AI-driven autonomy and decision-making for intelligent USVs in future maritime ICPS.
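A toy sketch of how a three-layer placement decision in such an architecture might look. The `Task` fields, latency constants, and capacity threshold are illustrative assumptions, not values from the paper; the point is only that latency budgets and compute cost jointly determine the layer.

```python
from dataclasses import dataclass

@dataclass
class Task:
    name: str
    deadline_ms: float   # latency budget for a useful result
    compute_cost: float  # relative compute demand (1.0 = edge capacity)

# Assumed round-trip latencies and edge capacity, for illustration only.
EDGE_LATENCY_MS = 20.0
CLOUD_LATENCY_MS = 150.0
EDGE_CAPACITY = 1.0

def place(task: Task) -> str:
    """Route a task to the cheapest layer that still meets its deadline.
    The IoT layer only acquires sensor data; inference runs higher up."""
    if task.deadline_ms >= CLOUD_LATENCY_MS or task.compute_cost > EDGE_CAPACITY:
        return "cloud"    # slack deadline or too heavy for the edge
    if task.deadline_ms >= EDGE_LATENCY_MS:
        return "edge"     # localized AI-driven processing
    return "onboard"      # hard real-time: stay on the vehicle
```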


An Adaptive ML Framework for Power Converter Monitoring via Federated Transfer Learning

Kakosimos, Panagiotis, Saberi, Alireza Nemat, Peretti, Luca

arXiv.org Artificial Intelligence

This study explores alternative framework configurations for adapting thermal machine learning (ML) models for power converters by combining transfer learning (TL) and federated learning (FL) in a piecewise manner. This approach inherently addresses challenges such as varying operating conditions, data sharing limitations, and security implications. The framework starts with a base model that is incrementally adapted by multiple clients using three state-of-the-art domain adaptation techniques: Fine-tuning, Transfer Component Analysis (TCA), and Deep Domain Adaptation (DDA). The Flower framework is employed for FL, using Federated Averaging for aggregation. Validation with field data demonstrates that fine-tuning offers a straightforward TL approach with high accuracy, making it suitable for practical applications. Benchmarking results reveal a comprehensive comparison of these methods, showcasing their respective strengths and weaknesses when applied in different scenarios. Locally hosted FL enhances performance when data aggregation is not feasible, while cloud-based FL becomes more practical as the number of clients grows significantly, addressing scalability and connectivity challenges.
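The aggregation rule the abstract names, Federated Averaging, can be sketched independently of the Flower framework. This is the size-weighted parameter mean from FedAvg, with client parameters represented as flat lists for brevity; real deployments operate on full model weight tensors.

```python
def fed_avg(client_weights, client_sizes):
    """Federated Averaging: average client parameter vectors, weighting
    each client by its local sample count."""
    total = float(sum(client_sizes))
    dim = len(client_weights[0])
    return [
        sum(n / total * w[i] for w, n in zip(client_weights, client_sizes))
        for i in range(dim)
    ]

# Two clients with 1 and 3 local samples: the larger client dominates.
global_params = fed_avg([[0.0, 4.0], [4.0, 0.0]], [1, 3])
```

In Flower this rule is provided by the built-in `FedAvg` strategy; the clients' local adaptation step (fine-tuning, TCA, or DDA) runs before each aggregation round.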


Unified Locational Differential Privacy Framework

Priyanshu, Aman, Maurya, Yash, Ganesh, Suriya, Tran, Vy

arXiv.org Artificial Intelligence

Aggregating statistics over geographical regions is important for many applications, such as analyzing income, election results, and disease spread. However, the sensitive nature of this data necessitates strong privacy protections to safeguard individuals. In this work, we present a unified locational differential privacy (DP) framework to enable private aggregation of various data types, including one-hot encoded, boolean, float, and integer arrays, over geographical regions. Our framework employs local DP mechanisms such as randomized response, the exponential mechanism, and the Gaussian mechanism. We evaluate our approach on four datasets representing significant location data aggregation scenarios. Results demonstrate the utility of our framework in providing formal DP guarantees while enabling geographical data analysis.
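Two of the local DP mechanisms named above can be sketched in a few lines. The function names and the count-debiasing helper are illustrative assumptions, not the framework's API; the formulas are the standard randomized-response and Gaussian-mechanism calibrations.

```python
import math
import random

def randomized_response(bit, epsilon):
    """Local DP for booleans: report truthfully with probability
    e^eps / (e^eps + 1), otherwise flip the answer."""
    p_truth = math.exp(epsilon) / (math.exp(epsilon) + 1)
    return bit if random.random() < p_truth else not bit

def gaussian_mechanism(value, sensitivity, epsilon, delta):
    """(eps, delta)-DP for floats: add Gaussian noise with
    sigma = sensitivity * sqrt(2 ln(1.25/delta)) / eps."""
    sigma = sensitivity * math.sqrt(2 * math.log(1.25 / delta)) / epsilon
    return value + random.gauss(0.0, sigma)

def debias_count(noisy_count, n, epsilon):
    """Unbiased estimate of the true count in a region from n
    randomized-response reports."""
    p = math.exp(epsilon) / (math.exp(epsilon) + 1)
    return (noisy_count - n * (1 - p)) / (2 * p - 1)
```

The aggregator sees only perturbed reports per region; `debias_count` recovers an unbiased regional statistic, which is the utility/privacy trade-off such frameworks evaluate.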


Data Aggregation for Hierarchical Clustering

Schubert, Erich, Lang, Andreas

arXiv.org Machine Learning

Hierarchical Agglomerative Clustering (HAC) is likely the earliest and most flexible clustering method, because it can be used with many distances, similarities, and various linkage strategies. It is often used when the number of clusters in the data set is unknown and some sort of hierarchy in the data is plausible. Most algorithms for HAC operate on a full distance matrix and therefore require quadratic memory. The standard algorithm also has cubic runtime to produce a full hierarchy. Both memory and runtime are especially problematic in the context of embedded or otherwise very resource-constrained systems. In this work, we present how data aggregation with BETULA, a numerically stable version of the well-known BIRCH data aggregation algorithm, can be used to make HAC viable on systems with constrained resources with only small losses in clustering quality, thus enabling exploratory data analysis of very large data sets.
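The core idea, aggregating many points into a few summary statistics before clustering, can be sketched as follows. This simplified one-pass scheme with a fixed radius threshold is an assumption for illustration, not BETULA's actual tree-based insertion, but it shows how n points collapse into k summaries, after which HAC needs only an O(k^2) distance matrix instead of O(n^2).

```python
import math

class ClusterFeature:
    """BIRCH/BETULA-style summary: count, mean, and per-dimension squared
    deviation, updated with a numerically stable Welford-style rule."""
    def __init__(self, point):
        self.n = 1
        self.mean = list(point)
        self.ssd = [0.0] * len(point)

    def add(self, point):
        self.n += 1
        for d, x in enumerate(point):
            delta = x - self.mean[d]
            self.mean[d] += delta / self.n
            self.ssd[d] += delta * (x - self.mean[d])

def aggregate(points, radius):
    """Greedy one-pass aggregation: absorb each point into the nearest
    summary within `radius`, otherwise open a new summary."""
    summaries = []
    for p in points:
        best = min(summaries, default=None,
                   key=lambda cf: math.dist(p, cf.mean))
        if best is not None and math.dist(p, best.mean) <= radius:
            best.add(p)
        else:
            summaries.append(ClusterFeature(p))
    return summaries
```

HAC would then run on the summary means (optionally weighted by `n`), which is where the small loss in clustering quality comes from.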


OrcoDCS: An IoT-Edge Orchestrated Online Deep Compressed Sensing Framework

Ching, Cheng-Wei, Gupta, Chirag, Huang, Zi, Hu, Liting

arXiv.org Artificial Intelligence

Compressed data aggregation (CDA) over wireless sensor networks (WSNs) is task-specific and subject to environmental changes. However, existing CDA frameworks (e.g., compressed sensing-based and deep learning (DL)-based data aggregation) do not possess the flexibility and adaptivity required to handle distinct sensing tasks and environmental changes. Additionally, they do not consider the performance of follow-up IoT data-driven DL-based applications. To address these shortcomings, we propose OrcoDCS, an IoT-Edge orchestrated online deep compressed sensing framework that offers high flexibility and adaptability to distinct IoT device groups and their sensing tasks, as well as high performance for follow-up applications. The novelty of our work is the design and deployment of an IoT-Edge orchestrated online training framework over WSNs that leverages a specially-designed asymmetric autoencoder, which largely reduces the encoding overhead and improves reconstruction performance and robustness. We show analytically and empirically that OrcoDCS outperforms the state-of-the-art DCDA on training time, significantly improves flexibility and adaptability for distinct reconstruction tasks, and achieves higher performance for follow-up applications.
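The asymmetry the abstract highlights, a cheap encoder on the sensor and a heavier decoder at the edge, can be sketched with untrained weights. All dimensions and layer shapes below are arbitrary assumptions; OrcoDCS trains its autoencoder online, which this static sketch omits.

```python
import numpy as np

rng = np.random.default_rng(0)

# Asymmetric autoencoder: one linear layer on the sensor side keeps the
# encoding cost to a single matmul; a deeper decoder at the edge does
# the heavy lifting during reconstruction.
D, M, H = 64, 16, 128   # signal dim, compressed dim, decoder hidden dim

W_enc = rng.normal(0, 1 / np.sqrt(D), (M, D))  # sensor-side encoder
W1 = rng.normal(0, 1 / np.sqrt(M), (H, M))     # edge-side decoder, layer 1
W2 = rng.normal(0, 1 / np.sqrt(H), (D, H))     # edge-side decoder, layer 2

def encode(x):
    return W_enc @ x                 # 64 -> 16: 4x fewer values to transmit

def decode(y):
    h = np.maximum(0.0, W1 @ y)      # ReLU hidden layer at the edge
    return W2 @ h                    # 16 -> 64 reconstruction

x = rng.normal(size=D)
x_hat = decode(encode(x))
```

The design choice is that sensors only ever pay for `encode`, while the edge absorbs both decoding and (in the real framework) the online training updates.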


Specification-Guided Data Aggregation for Semantically Aware Imitation Learning

Shah, Ameesh, DeCastro, Jonathan, Gideon, John, Yalcinkaya, Beyazit, Rosman, Guy, Seshia, Sanjit A.

arXiv.org Artificial Intelligence

Advancements in simulation and formal methods-guided environment sampling have enabled the rigorous evaluation of machine learning models in a number of safety-critical scenarios, such as autonomous driving. Application of these environment sampling techniques towards improving the learned models themselves has yet to be fully exploited. In this work, we introduce a novel method for improving imitation-learned models in a semantically aware fashion by leveraging specification-guided sampling techniques as a means of aggregating expert data in new environments. Specifically, we create a set of formal specifications as a means of partitioning the space of possible environments into semantically similar regions, and identify elements of this partition where our learned imitation behaves most differently from the expert. We then aggregate expert data on environments in these identified regions, leading to more accurate imitation of the expert's behavior semantics. We instantiate our approach in a series of experiments in the CARLA driving simulator, and demonstrate that our approach leads to models that are more accurate than those learned with other environment sampling methods.
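The partition-and-aggregate loop described above can be sketched abstractly. The callable interfaces for specifications, the divergence measure, and the expert are hypothetical simplifications; in the paper, specifications are formal properties over CARLA environments and divergence compares learner and expert behavior semantics.

```python
def spec_guided_aggregation(envs, specs, divergence, expert, budget):
    """Bucket environments by which formal specifications they satisfy,
    find the semantic region where the learner diverges most from the
    expert, and collect expert demonstrations there."""
    buckets = {}
    for env in envs:
        key = tuple(spec(env) for spec in specs)  # semantic signature
        buckets.setdefault(key, []).append(env)
    # rank semantic regions by mean learner-vs-expert divergence
    worst_key = max(
        buckets,
        key=lambda k: sum(divergence(e) for e in buckets[k]) / len(buckets[k]),
    )
    # aggregate expert data on environments from the worst region
    return [(env, expert(env)) for env in buckets[worst_key][:budget]]
```

The returned (environment, expert-label) pairs would be appended to the training set before re-training the imitation policy, DAgger-style but targeted at a semantic region.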


From Regulation to Data Aggregation: Three Machine Learning Trends to Watch

#artificialintelligence

For over a decade, we've discussed the potential of machine learning (ML) in clinical research to objectively gather and analyze data, optimize trial design, and accelerate drug development. While the opportunities of these technologies get a lot of buzz, there is still a long way to go when it comes to proving they can deliver on their promise and ensuring their development is sustainable long-term. We now find ourselves at a crossroads: improving confidence in ML among pharmaceutical sponsors and clinicians, while finding alternative ways to keep pace with the data-hungry nature of these algorithms. Three key trends will direct the future of ML: regulatory guidance, an emphasis on model traceability as a means to build trust, and new data aggregation and analysis approaches that may help make ML innovation more practical and cost-effective. Until recently, federal oversight of ML's development has been limited, with developers defining best practices based on their own experience.


How machine learning speeds up Power BI reports

#artificialintelligence

The goal of Power BI (and any business intelligence tool) is to replace the hunches and opinions businesses use to make decisions with facts based on data. That means the insights in that data have to be available quickly, so you can pull up a report while people are still discussing what it covers, not five minutes later when everyone has already made up their mind. To make that happen even with large data sets, wherever they're stored, Microsoft now uses machine learning to tune how the data gets accessed. When you have enough data to make decisions with, you need to consolidate and summarize it, while still keeping the original dimensions--so you can look at total sales combined across all departments and get an overview but then slice it by region or month to compare trends. Most Power BI users need these aggregated queries, CTO of Microsoft Analytics Amir Netz told TechRepublic.
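The rollup Netz describes, summarizing while keeping the original dimensions so totals can still be sliced by region or month, is in essence a grouped aggregation. A stdlib sketch with made-up sales rows (Power BI itself materializes such summary tables automatically):

```python
from collections import defaultdict

# Hypothetical fact table; field names are illustrative only.
sales = [
    {"region": "East", "month": "Jan", "amount": 100},
    {"region": "East", "month": "Feb", "amount": 150},
    {"region": "West", "month": "Jan", "amount": 200},
]

def aggregate_by(rows, dims):
    """Pre-aggregate a fact table along the given dimensions, the way a
    BI engine materializes summary tables for fast interactive slicing."""
    totals = defaultdict(float)
    for row in rows:
        totals[tuple(row[d] for d in dims)] += row["amount"]
    return dict(totals)

by_region = aggregate_by(sales, ["region"])          # slice by region
by_region_month = aggregate_by(sales, ["region", "month"])  # finer slice
overall = sum(by_region.values())                    # grand total
```

Serving queries from the small pre-aggregated tables, rather than scanning the raw rows, is what keeps reports responsive on large data sets; the ML part is deciding which of these aggregations are worth materializing.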


How to Leverage Chatbots for Lead Nurturing and Conversions?

#artificialintelligence

What is the most challenging aspect of marketing? Identifying your potential customers, predicting their interests, and engaging and nurturing them toward an eventual purchase is not easy. This is where Artificial Intelligence can help you analyze the customer pool and identify and segregate leads. Further, you can nurture them using AI-based algorithms, which helps deliver high value. Take the example of the famous inbound marketing giant HubSpot.